-
Notifications
You must be signed in to change notification settings - Fork 1.9k
WIP: Fixed column name\order in iris dataset #408
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
I have switched the order and the name of two columns. Not sure, why it affected acc metric. @Ivanidzo4ka , @codemzs is that expected? |
Hi Olia,
You will need to update the order in which values are passed during prediction as well to ensure the ACC metrics are same. Do we really need to change the order in the dataset? In reply to: 400119754 [](ancestors = 400119754) |
Hi thanks @OliaG ... incidentally could we perhaps make changes in forks of this repository, rather than in this repository itself? |
I wanted to jump in and tackle this one as my first bug fix, but @OliaG it's yours :). I believe that also tests (TrainAndPredictIrisModelTest and TrainAndPredictIrisModelWithStringLabelTest) should be modified to have more realistic values when testing prediction (since those are shifted there as well). This is where I have encountered some strange behavior: for values (SepalLength, SepalWidth, PetalLength, PetalWidth) = (4.4, 3.1, 2.5, 1.2) predictions when the label is string (TrainAndPredictIrisModelWithStringLabelTest) and when the label is numeric (TrainAndPredictIrisModelTest) were different dramatically - I still haven't figured out if I was causing it for some reason (the dataset modifications were done the same way you did). Here is my repo's commit. |
@OliaG as Tom mentioned it would be great if you would do the PR against a fork. Can we perhaps close this PR (remove the branch) and take a PR against @bojanmisic fork? |
@shauheen sure, let's do it. I'll close this PR and @bojanmisic can create his. Also we will need to update our samples repo: classification and clustering examples that are using the same dataset. @bojanmisic you're welcome to do that too or I can do it. |
per issue #400
Fixed the iris dataset